Ridges Based Curled Textline Region Detection from Grayscale Camera-Captured Document Images

Authors

  • Syed Saqib Bukhari
  • Faisal Shafait
  • Thomas M. Breuel
Abstract

Compared to scanners, cameras offer fast, flexible and non-contact document imaging, but introduce distortions such as uneven shading and warped shape. Camera-captured document images therefore need preprocessing steps such as binarization and textline detection for dewarping, so that traditional document image processing steps can be applied to them. Previous approaches to binarization and curled textline detection are sensitive to distortions and lose some crucial image information at each step, which badly affects dewarping and further processing. Here we introduce a novel algorithm for curled textline region detection directly from a grayscale camera-captured document image, in which a matched filter bank approach is used to enhance textline structure and ridge detection is then applied to find the central line of curled textlines. The resulting ridges can potentially be used for binarization, dewarping or designing new techniques for camera-captured document image processing. Our approach is robust against bad shading and high degrees of curl. We have achieved around 91% detection accuracy on the dataset of the CBDAR 2007 document image dewarping contest.
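
The two stages named in the abstract (filter-bank enhancement, then ridge detection) can be illustrated with a toy sketch. The abstract does not specify the filter design or the ridge detector, so everything below is an assumption made for illustration: the bank uses oriented anisotropic Gaussian kernels, the ridge step is replaced by a much simpler per-column local-maximum search, and all function names, angles and sigma values are hypothetical. It runs on a small synthetic image in which text is already inverted to bright on dark.

```python
import math

def oriented_gaussian(radius, sigma_long, sigma_short, theta):
    """Anisotropic Gaussian kernel elongated along angle theta (radians), summing to 1."""
    kernel, total = [], 0.0
    for dy in range(-radius, radius + 1):
        row = []
        for dx in range(-radius, radius + 1):
            # Rotate so that u runs along the assumed textline and v across it.
            u = dx * math.cos(theta) + dy * math.sin(theta)
            v = -dx * math.sin(theta) + dy * math.cos(theta)
            w = math.exp(-0.5 * ((u / sigma_long) ** 2 + (v / sigma_short) ** 2))
            row.append(w)
            total += w
        kernel.append(row)
    return [[w / total for w in row] for row in kernel]

def filter_image(img, kernel):
    """Correlate img with kernel, clamping coordinates at the image border."""
    h, w = len(img), len(img[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                irow = img[min(max(y + dy, 0), h - 1)]
                krow = kernel[dy + r]
                for dx in range(-r, r + 1):
                    acc += krow[dx + r] * irow[min(max(x + dx, 0), w - 1)]
            out[y][x] = acc
    return out

def enhance_textlines(img, angles_deg=(-30, 0, 30), radius=5,
                      sigma_long=4.0, sigma_short=1.5):
    """Per-pixel maximum response over the (assumed) oriented filter bank."""
    h, w = len(img), len(img[0])
    best = [[0.0] * w for _ in range(h)]
    for angle in angles_deg:
        kernel = oriented_gaussian(radius, sigma_long, sigma_short, math.radians(angle))
        resp = filter_image(img, kernel)
        for y in range(h):
            for x in range(w):
                best[y][x] = max(best[y][x], resp[y][x])
    return best

def column_ridges(resp, threshold):
    """Simplified stand-in for ridge detection: local maxima along each column."""
    h, w = len(resp), len(resp[0])
    ridges = []
    for x in range(w):
        for y in range(1, h - 1):
            v = resp[y][x]
            if v > threshold and v >= resp[y - 1][x] and v > resp[y + 1][x]:
                ridges.append((x, y))
    return ridges

def synthetic_page(h=32, w=48):
    """Toy inverted page: one curled 3-pixel-thick textline (1.0) on background (0.0)."""
    img = [[0.0] * w for _ in range(h)]
    centers = []
    for x in range(w):
        c = 16 + round(4 * math.sin(math.pi * x / (w - 1)))
        centers.append(c)
        if x % 3 != 2:  # leave every third column empty to mimic character gaps
            for y in range(c - 1, c + 2):
                img[y][x] = 1.0
    return img, centers

img, centers = synthetic_page()
resp = enhance_textlines(img)
peak = max(max(row) for row in resp)
ridges = column_ridges(resp, 0.4 * peak)
print(len(ridges), "ridge points found")
```

Taking the maximum over orientations lets a line that bends across the page still respond strongly to whichever kernel is locally best aligned, and the elongated kernels bridge the gaps between characters so that the line reads as one continuous bright band. A real implementation would more likely derive ridges from local image derivatives than from a column scan, which only works for roughly horizontal lines.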

Related Resources

Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)

Document images produced by a scanner or digital camera usually suffer from geometric and photometric distortions. Both deteriorate the performance of OCR systems. In this paper, we present a novel method to compensate for undesirable geometric distortions, aiming to improve OCR results. Our methodology is based on finding text lines by dynamic local connectivity map and then applying a l...

Foreground-Background Regions Guided Binarization of Camera-Captured Document Images

Binarization is an important preprocessing step in several document image processing tasks. Handheld camera devices are now in widespread use and allow fast and flexible document image capture. However, they may produce degraded grayscale images, especially due to bad shading or non-uniform illumination. State-of-the-art binarization techniques, which are designed for scanned images, do not...

Dewarping of Document Images using Coupled-Snakes

Traditional OCR systems are designed for planar (dewarped) images, and their accuracy is reduced when they are applied to warped images. Therefore, developing new OCR techniques for warped images or developing dewarping techniques are the possible solutions for improving OCR accuracy on camera-captured documents. Among different types of dewarping techniques, curled textlines information based dewarping techn...

Performance Evaluation of Curled Textlines Segmentation Algorithms

Curled textlines segmentation is a necessary initial step for hand-held camera-captured document image processing. Curled textlines information is often used as an intermediate step for camera-captured document image dewarping. It can also be used for other camera-based document image processing tasks, such as layout analysis. So far no work has been done for the ...

Restoration of Arbitrarily Warped Document Images Based on Text Line and Word Detection

This paper presents a novel technique for efficient restoration of arbitrarily warped document images. Our aim is to recover document images that are mainly bounded volumes captured by a digital camera and suffer from non-linear warp. The proposed technique is applied on gray scale document images and is based on several distinct steps: an adaptive document image binarization, a text line and w...

Publication date: 2009